Quantifying cross-linguistic variation in grapheme-to-phoneme mapping

Authors

  • Martine Coene
  • Annemiek Hammer
  • Wojtek Kowalczyk
  • Louis ten Bosch
  • Bart Vaerenberg
  • Paul Govaerts
Abstract

Languages have been described in the literature as having more or less transparent orthographies, depending on how predictable their spelling-to-sound correspondences are. Quantitative measures that are based on large-scale language corpora and can objectively assess such cross-linguistic variation are scarce. The quantitative assessment method presented here builds on the correlation between distances of phonemic and graphemic frequency distributions of a given sample and similar distances obtained from large corpora of the same language. The metric may be used as a research tool to investigate the potential effect of orthographic transparency on the development and performance of reading in different populations.
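
The abstract does not spell out the computation, so the following is only one possible reading of it, not the authors' procedure. The sketch assumes that each text sample is available both as a grapheme sequence and as a phoneme sequence, that each sample's frequency distribution is compared with the corresponding large-corpus distribution using Euclidean distance, and that the two resulting distance series are compared with a Pearson correlation. The function names, the choice of distance, and the choice of correlation coefficient are all assumptions made for illustration.

```python
from collections import Counter
from math import sqrt


def freq_dist(symbols):
    """Relative frequency of each symbol (grapheme or phoneme) in a sequence."""
    counts = Counter(symbols)
    total = sum(counts.values())
    return {s: c / total for s, c in counts.items()}


def euclidean_distance(p, q):
    """Euclidean distance between two frequency distributions.

    Assumption: the abstract does not name the distance measure used.
    """
    keys = set(p) | set(q)
    return sqrt(sum((p.get(k, 0.0) - q.get(k, 0.0)) ** 2 for k in keys))


def pearson(xs, ys):
    """Pearson correlation; assumes neither series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)


def transparency_score(samples, corpus_graphemes, corpus_phonemes):
    """Correlate graphemic and phonemic sample-to-corpus distances.

    samples: list of (grapheme_sequence, phoneme_sequence) pairs.
    Hypothetical helper: a high value means that samples that look unusual
    in spelling also look unusual in sound, as expected when spelling
    predicts pronunciation closely.
    """
    ref_g = freq_dist(corpus_graphemes)
    ref_p = freq_dist(corpus_phonemes)
    g_dists = [euclidean_distance(freq_dist(g), ref_g) for g, _ in samples]
    p_dists = [euclidean_distance(freq_dist(p), ref_p) for _, p in samples]
    return pearson(g_dists, p_dists)
```

In a toy language where every grapheme maps to exactly one phoneme, the two distance series coincide and the score approaches 1; less predictable spelling-to-sound mappings would be expected to pull it lower.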

Related resources

A latent analogy framework for grapheme-to-phoneme conversion

Data-driven grapheme-to-phoneme conversion involves either (top-down) inductive learning or (bottom-up) pronunciation by analogy. As both approaches rely on local context information, they typically require some external linguistic knowledge, e.g., individual grapheme/phoneme correspondences. To avoid such supervision, this paper proposes an alternative solution, dubbed pronunciation by latent ...

Data-Oriented Methods for Grapheme-to-Phoneme Conversion

It is traditionally assumed that various sources of linguistic knowledge and their interaction should be formalised in order to be able to convert words into their phonemic representations with reasonable accuracy. We show that using supervised learning techniques, based on a corpus of transcribed words, the same and even better performance can be achieved, without explicit modeling of linguist...

Solving the Phoneme Conflict in Grapheme-to-Phoneme Conversion Using a Two-Stage Neural Network-Based Approach

To achieve high quality output speech synthesis systems, data-driven grapheme-to-phoneme (G2P) conversion is usually used to generate the phonetic transcription of out-of-vocabulary (OOV) words. To improve the performance of G2P conversion, this paper deals with the problem of conflicting phonemes, where an input grapheme can, in the same context, produce many possible output phonemes at the sa...

Modified Grapheme Encoding and Phonemic Rule to Improve PNNR-Based Indonesian G2P

A grapheme-to-phoneme conversion (G2P) is very important in both speech recognition and synthesis. The existing Indonesian G2P based on pseudo nearest neighbour rule (PNNR) has two drawbacks: the grapheme encoding does not adapt all Indonesian phonemic rules and the PNNR should select a best phoneme from all possible conversions even though they can be filtered by some phonemic rules. In this p...

Alexia with and without agraphia: an assessment of two classical syndromes.

BACKGROUND Current cognitive models propose that multiple processes are involved in reading and writing. OBJECTIVE Our goal was to use linguistic analyses to clarify the cognitive dysfunction behind two classic alexic syndromes. METHODS We report four experiments on two patients, one with alexia without agraphia following occipitotemporal lesions, and one with alexia with agraphia from a le...

Publication date: 2013